A simulation-driven prediction model for state of charge estimation of electric vehicle lithium battery

Accurately predicting the state of charge (SOC) of lithium-ion batteries in electric vehicles is crucial for ensuring their stable operation. However, the component values related to SOC in the circuit typically require estimation through parameter identification. This paper proposes a three-stage method for estimating the SOC of lithium batteries in electric vehicles. First, the parameters of the constructed second-order RC circuit are identified using the Forgetting Factor Recursive Least Squares (FFRLS) method. Second, a battery simulation model is constructed using a model-data fusion method. Finally, the predicted values of the simulation model are corrected using the unscented Kalman filter (UKF). Validation on datasets demonstrates the high precision of this method in parameter identification. Moreover, the SOC prediction corrections of the Particle Filter (PF), the Extended Kalman Filter (EKF), and the proposed UKF are compared on simulated prediction data and experimental test data. The proposed method achieves the lowest root mean square error (RMSE): 0.0025 on simulation prediction data and 0.0186 on experimental test data. It also keeps its error within 5 % on actual data.


Introduction
As the demand for an improved quality of life gradually increases, so too does the demand for vehicles. This surge contributes to the energy and environmental crises we face today [1]. Therefore, there has been an increase in the pursuit of new materials, such as nano-related materials [2,3] and lithium-ion new energy sources [4,5]. Lithium-ion batteries are being considered as a potential solution for energy storage in transportation, with the aim of reducing the environmental pollution associated with the energy crisis. In recent years, lithium-ion batteries, with their high energy density, charge-discharge performance, and stability, have become a common choice of power source for electric vehicles (EVs). Their use can reduce environmental pollution and save energy [6,7], so they are widely favored.
However, a battery is a complex system of chemical reactions. Its structure comprises various components, including positive and negative electrodes [8], model [9], and internal medium [10]. The positive and negative ions in the electrolyte move between the electrodes under the electric current, leading to chemical reactions. These reactions are nonlinear and time-varying and can be influenced by various factors, including temperature. To ensure optimal battery performance, electric vehicles are equipped with battery management systems (BMS). The BMS is responsible for a range of functions, including predicting the state of charge (SOC), state of health (SOH), and remaining useful life (RUL) of the battery.
In the realm of lithium battery-related index prediction, methods for estimating State of Health (SOH) have demonstrated notable advancements. For instance, Ref. [11] introduced the ANA-LSTM neural network method, which exhibits remarkable accuracy in predicting the remaining useful life of lithium batteries, even amidst multiple influencing factors. Additionally, Ref. [12] presented a novel framework that integrates Mixers and Bidirectional Temporal Convolutional Neural Network (BTCN) for SOH estimation. Moreover, Ref. [13] proposed an enhanced modeling approach, the robust multi-time scale singular filtering-Gaussian process regression-long short-term memory (SF-GPR-LSTM) method, tailored for precise remaining capacity estimation of lithium batteries in low-temperature environments.
Inspired by the many prediction methods for State of Health (SOH), researchers have extended their application to State of Charge (SOC) prediction, an area with relatively fewer established methods. Currently, two primary SOC prediction methodologies exist. The first is model-based and involves constructing circuit models from battery charging and discharging characteristics, as explored by Ren et al. [14], who proposed a SOC prediction method combining unscented Kalman filtering with initial SOC acceleration convergence. While model-based approaches offer simplicity and reliability, establishing an accurate battery model is crucial for achieving highly precise SOC prediction results. On the other hand, data-driven methods predominantly utilize current and voltage data obtained from battery tests, as demonstrated by Liu et al. [15], who proposed a data-driven SOC prediction method for lithium-ion batteries based on Extended Kalman Filtering (EKF), leveraging machine learning to enhance prediction accuracy. Wu et al. [16] introduced a battery SOC prediction method based on electric vehicle trip data, integrating random forest dimensionality reduction with long short-term memory (LSTM) to enhance prediction robustness.
Although the equivalent circuit model is widely used, its simplicity belies the significant impact of its parameters on SOC prediction accuracy. Hence, it is imperative to consider the influence of these parameters when predicting battery SOC, as emphasized by Tan et al. [17], who proposed an extended Kalman filtering recursive least squares method based on the second-order RC equivalent circuit model to enhance identification accuracy. Additionally, Hu et al. [18] proposed collaborative particle swarm optimization (MCPSO) with hybrid swarm encapsulation to identify and amplify battery parameters for collaborative updating, offering faster convergence to optimal solutions with higher accuracy. Shi et al. [19] employed the forgetting factor recursive least squares method to adaptively adjust battery parameters, enhancing identification accuracy. However, many existing methods overlook the influence of internal parameters on battery performance. Hence, this paper proposes an approach that integrates the predictive power of model-based methodologies with real-time data linkage to improve SOC prediction. This fusion allows for rapid battery simulation and comprehensive characterization of the relationships between parameters and SOC, leading to highly accurate predictions.
In this paper, based on the current and voltage data of a lithium battery tested under the Urban Dynamometer Driving Schedule (UDDS), a second-order RC equivalent circuit was established to describe the dynamic operating characteristics of the battery. The battery parameters were identified by Forgetting Factor Recursive Least Squares (FFRLS), and the functional relationship between SOC and each parameter was fitted. Based on the fitted functions, a simulation model corresponding to the structure of the second-order RC equivalent circuit was established in Simulink to verify the accuracy of parameter identification. Based on the simulation results, the Unscented Kalman Filter (UKF) was used to correct the SOC prediction results, and the optimal prediction curve of battery SOC was obtained. The highlight of the present work is a three-stage method for state of charge estimation of electric vehicle lithium batteries. The detailed contributions are summarized as follows.
1) Development of a Refined Three-Step Methodology for SOC Estimation: A three-step process is proposed that goes beyond traditional SOC estimation methods by integrating a model-data fusion approach based on the electrical circuit simulation model. This approach ensures a more accurate, reliable, and comprehensive analysis of battery performance.
2) Advanced Parameter Identification via the FFRLS Method: A cornerstone of our research is the use of the Forgetting Factor Recursive Least Squares (FFRLS) method to identify the parameters of a second-order RC circuit model. This technique allows for the precise determination of the interrelationships between battery components and SOC. By identifying these parameters, this work bridges the gap between theoretical models and real-world data.
3) Advanced Simulation Model Based on Second-Order RC Circuit for SOC Prediction: Our work introduces a simulation model grounded in the parameters of a second-order RC circuit model, enabling precise SOC predictions through model-data fusion. This approach merges theoretical insights with empirical data, enhancing the accuracy of SOC estimates.
4) Prediction Accuracy Enhanced with Unscented Kalman Filter: The SOC predictions are refined using the Unscented Kalman Filter (UKF). This method corrects estimates based on real-time data, significantly improving the accuracy of our SOC predictions by accounting for the nonlinear behavior of lithium batteries.
The remainder of this manuscript is organized as follows. Section II introduces the theoretical underpinnings and critical methodologies employed. Section III details the proposed method. Section IV presents the experimental validation and the results that demonstrate the efficacy of the method. Section V summarizes our findings, reflects on the implications of our work, and suggests avenues for future research.

Second-order RC equivalent circuit model
This paper employs lithium iron phosphate batteries. Based on their charge-discharge characteristics, a second-order RC equivalent circuit model is established to describe the dynamic operating behavior of the battery [20], as illustrated in Fig. 1.
The second-order RC cell model mainly consists of the following parts. I is the battery input current, and U_t is the battery output voltage.
U_OCV is the open-circuit voltage of the battery and is correlated with SOC. R0 is the ohmic internal resistance of the battery, which impedes the flow of current. When the battery is discharged, the ohmic internal resistance is the main factor that reduces the capacity of the battery.
R1 and C1 are the concentration polarization resistance and concentration polarization capacitance, respectively, which are mainly caused by the resistance encountered by ions in the electrolyte as they react at the electrode.
R2 and C2 are the electrochemical polarization resistance and electrochemical polarization capacitance, respectively, which arise from the slower electrochemical reaction at the electrode.
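The behavior of the circuit described above can be illustrated with a minimal Python sketch. It is not the authors' Simulink model: it assumes constant parameters, a discharge-positive current convention, and a current held constant over each sampling step, and all numerical values are illustrative placeholders rather than identified battery parameters.

```python
import numpy as np

def simulate_2rc(current, dt, r0, r1, c1, r2, c2, u_ocv):
    """Terminal voltage of a second-order RC model for a current profile.

    Assumptions (not from the paper): discharge current is positive,
    parameters are constant, and the current is constant over each step.
    """
    a1 = np.exp(-dt / (r1 * c1))  # decay factor of the R1-C1 branch
    a2 = np.exp(-dt / (r2 * c2))  # decay factor of the R2-C2 branch
    u1 = u2 = 0.0                 # polarization voltages start relaxed
    u_t = np.empty(len(current))
    for k, i_k in enumerate(current):
        # Terminal voltage: OCV minus ohmic drop minus both polarization voltages
        u_t[k] = u_ocv - i_k * r0 - u1 - u2
        # Exact zero-order-hold update of each RC branch
        u1 = a1 * u1 + r1 * (1.0 - a1) * i_k
        u2 = a2 * u2 + r2 * (1.0 - a2) * i_k
    return u_t

# Illustrative placeholder values only (not identified parameters)
current = np.full(2000, 1.0)  # 1 A constant discharge
u_t = simulate_2rc(current, dt=1.0, r0=0.02, r1=0.01, c1=1000.0,
                   r2=0.02, c2=5000.0, u_ocv=3.3)
```

The first sample shows only the ohmic drop across R0, while the two RC branches add slower voltage sag with time constants R1·C1 and R2·C2, which is why two branches are used to separate fast and slow polarization effects.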

Parameter identification method
The internal parameters of the battery cannot be measured directly by instruments, so they must be identified mathematically by applying an identification algorithm to measurable current and voltage data [21][22][23].
In this paper, the least squares method, which requires little computation, is used to identify battery parameters; however, the accuracy of the traditional least squares method is low. Therefore, the recursive least squares method with a forgetting factor is used in this paper to identify battery parameters. The forgetting factor effectively reduces the influence of outdated data on the identification accuracy, thereby improving it. The parameter identification values corresponding to SOC data points were rounded, and the relationship curve between each parameter and SOC was obtained using a fitting tool and applied to the Simulink simulation and SOC prediction.
FFRLS (Forgetting Factor Recursive Least Squares) is an enhancement of the conventional RLS algorithm [24]. This improvement is necessitated by the observation that the RLS method is prone to severe filtering saturation as the number of samples increases. Such saturation causes the algorithm's parameter estimates to fail to track time-varying parameters in real time, consequently diminishing its data correction capabilities. To address this, the FFRLS method incorporates a forgetting factor (0.95 < μ < 1) into the RLS identification algorithm. This addition mitigates the accumulation of outdated data during iterative computations, thereby amplifying the feedback effect of new data. The fundamental computational formula is given in Eq. (1). In this equation, y(k+1) denotes the actual system observation value; φ^T(k) represents the system data variable; θ(k) signifies the optimum estimate of the parameter variable at instant k; and e(k) denotes the zero-mean white noise, also referred to as the innovation vector, indicating the discrepancy between the optimum prediction at the current moment and the output value at the next instant.
The calculation process of FFRLS is shown in Eq. (2), where μ is the forgetting factor, usually ranging from 0.95 to 0.99, K_k is the gain matrix, and P_k is the error covariance matrix. Feeding the open-source current and voltage data into the algorithm yields the parameter values at each moment.
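As an illustration of the recursion, the following Python sketch implements the textbook forgetting-factor RLS update (gain, innovation correction, covariance update). It is a generic sketch and may differ in detail from Eq. (2); the function name `ffrls`, the initial covariance `p0`, and the synthetic test data are assumptions introduced here.

```python
import numpy as np

def ffrls(phi_rows, y, mu=0.98, p0=1e4):
    """Textbook forgetting-factor RLS; returns the estimate at every step.

    phi_rows: (N, n) data vectors, y: (N,) observations, mu: forgetting factor.
    p0 is a hypothetical large initial covariance (weak prior on theta).
    """
    n = phi_rows.shape[1]
    theta = np.zeros(n)
    P = p0 * np.eye(n)
    history = np.empty((len(y), n))
    for k, (phi, y_k) in enumerate(zip(phi_rows, y)):
        K = P @ phi / (mu + phi @ P @ phi)       # gain vector
        theta = theta + K * (y_k - phi @ theta)  # correct with the innovation
        P = (P - np.outer(K, phi) @ P) / mu      # discount old information
        history[k] = theta
    return history

# Synthetic check on a known linear system (not battery data)
rng = np.random.default_rng(0)
phi_rows = rng.normal(size=(500, 3))
true_theta = np.array([0.5, -1.2, 2.0])
y = phi_rows @ true_theta + 1e-3 * rng.normal(size=500)
theta_hat = ffrls(phi_rows, y)[-1]
```

Dividing the covariance by μ at each step keeps the gain from shrinking toward zero, which is what lets the estimate keep following parameters that drift as the SOC changes.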
J. Zhang et al.

SOC prediction optimization algorithm
The Unscented Kalman Filter (UKF) and the Extended Kalman Filter (EKF) are two filters based on the Kalman filter principle [25,26]. Compared with the EKF, the UKF has better accuracy and convergence. By obtaining the corresponding Sigma points through the Unscented Transformation (UT), the UKF approximates the posterior probability density of the state [27], thus ensuring the accuracy of the filtering prediction and avoiding the complexity of computing the state transition matrix and observation matrix [28,29]. The UKF calculation process is as follows.
Step 1: Determine the initial battery system status. Initialize the mean value of the state variable x_0, the initial value of the covariance matrix P_0, and the initial values of the mean and variance weights, as shown in Eq. (3).
In Eq. (3), E(⋅) represents the mean value of each variable. The weights are shown in Eq. (4),
in which W_m and W_c are the mean and variance weights, respectively. λ is the proportional coefficient used to reduce the prediction error of the system. L is the dimension of the state variable, which is determined by the dimension of the state matrix. κ is the scale factor. β is a nonnegative weight coefficient. α controls the spread of the sampling points and ranges from 10^−4 to 1.
Step 2: Calculate the Sigma sampling points of the state variables at time k and store them, as shown in Eq. (5). x_k is the optimal estimated value of the state variable at time k, P_k is the covariance of the state variable at time k, and x^i_{sigma,k} is the i-th Sigma sampling point at time k.
Step 3: Time update of the state. According to Eq. (6) to Eq. (8), the 2L+1 Sigma sampling points calculated in Step 2 are used to update the mean value of the state variable x^i_{pred,k|k−1} and the covariance P_pred at time k,
where Q_{k−1} is the covariance of the process noise of the battery system. The subscript k|k−1 indicates that the value at time k is predicted from the value at time k−1, before the measurement update. x^i_{sigmapre,k|k−1} is the predicted Sigma point set based on time k−1.
Step 4: Observation update. Calculate the mean of the observation prediction ŷ_{k|k−1} and the covariances, as shown in Eq. (9) to Eq. (12),
where R_{k−1} is the covariance of the observation noise of the battery system, ŷ_{k|k−1} is the predicted observation based on time k−1, and P_{y,k} and P_{xy,k} are the observation covariance and the state-observation cross covariance at time k.
Step 5: Calculate the unscented Kalman filter gain matrix K_k at time k, as shown in Eq. (13).
Step 6: State and covariance update. Calculate the updated optimal predicted value of the state variable x_k and the optimal covariance matrix P_k at time k, as shown in Eq. (14) and Eq. (15).
The above six steps complete the Sigma point sampling and state prediction update of the UKF algorithm; that is, the optimal UKF prediction of the battery SOC at time k is obtained from the optimal predicted value of the state variable x_k.
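The six steps above can be sketched in Python as follows. This is a generic implementation of the standard scaled unscented transform, not a reproduction of Eqs. (3) to (15): the weight formulas, the redrawing of Sigma points before the observation update, the function names, and the linear toy system used to exercise the code are all assumptions introduced for illustration.

```python
import numpy as np

def sigma_points(x, P, lam):
    """Return the 2L+1 sigma points of N(x, P) as rows."""
    S = np.linalg.cholesky((x.size + lam) * P)  # S @ S.T == (L + lam) * P
    return np.vstack([x, x + S.T, x - S.T])

def ukf_step(x, P, u, y, f, h, Q, R, alpha=1e-3, beta=2.0, kappa=0.0):
    """One UKF predict/correct cycle (standard scaled unscented transform)."""
    L = x.size
    lam = alpha**2 * (L + kappa) - L
    Wm = np.full(2 * L + 1, 0.5 / (L + lam))      # mean weights
    Wc = Wm.copy()                                # covariance weights
    Wm[0] = lam / (L + lam)
    Wc[0] = lam / (L + lam) + (1.0 - alpha**2 + beta)

    # Steps 2-3: propagate sigma points through the state function f
    X = np.array([f(s, u) for s in sigma_points(x, P, lam)])
    x_pred = Wm @ X
    dX = X - x_pred
    P_pred = dX.T @ (Wc[:, None] * dX) + Q

    # Step 4: redraw sigma points around the prediction and observe them
    X = sigma_points(x_pred, P_pred, lam)
    Y = np.array([np.atleast_1d(h(s, u)) for s in X])
    y_pred = Wm @ Y
    dX, dY = X - x_pred, Y - y_pred
    P_yy = dY.T @ (Wc[:, None] * dY) + R
    P_xy = dX.T @ (Wc[:, None] * dY)

    # Steps 5-6: gain, state correction, covariance correction
    K = P_xy @ np.linalg.inv(P_yy)
    x_new = x_pred + K @ (np.atleast_1d(y) - y_pred)
    P_new = P_pred - K @ P_yy @ K.T
    return x_new, P_new

# Hypothetical linear toy system (placeholder numbers, not battery parameters)
A = np.array([[1.0, 0.0], [0.0, 0.9]])
B = np.array([-1e-3, 1e-2])
C = np.array([[1.0, -1.0]])
D = -0.02
Q = 1e-6 * np.eye(2)
R = np.array([[1e-4]])
x0, P0 = np.array([0.5, 0.0]), 0.01 * np.eye(2)
u, y = 1.0, 0.46
x_new, P_new = ukf_step(x0, P0, u, y,
                        f=lambda s, u: A @ s + B * u,
                        h=lambda s, u: C @ s + D * u, Q=Q, R=R)
```

For a linear system the unscented transform is exact, so one step of this sketch should coincide with an ordinary Kalman filter step; that property gives a convenient sanity check before substituting a nonlinear battery model for `f` and `h`.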

Proposed three-step method
To better solve the SOC prediction problem of lithium-ion batteries, this paper proposes a three-step method for estimating the state of charge of lithium-ion batteries in electric vehicles. The flow chart is shown in Fig. 2, and the detailed steps are summarized as follows. Through these steps, not only is an accurate second-order RC circuit model that reflects the dynamic working characteristics of batteries constructed and validated, but effective prediction of the SOC is also achieved, providing accurate methodological support for battery management systems.

Introduction of dataset
The Urban Dynamometer Driving Schedule (UDDS), also referred to as the FTP-72 cycle, is a test procedure used by the US Environmental Protection Agency in 1972 to certify vehicle emissions, and it was later often used as one of the test conditions for lithium batteries [30][31][32]. In this paper, open-source battery data measured under the UDDS were obtained from Mendeley Data, the research data management and sharing platform of Elsevier.
The current, voltage, battery capacity, and other open-source data of the Panasonic 18650PF lithium battery (battery type is lithium iron phosphate battery) measured under the UDDS are used as research data, as shown in Fig. 3(a) to Fig. 3(c).
The data link is: https://data.mendeley.com/datasets/wykht8y7tg/1. The parameters of the Panasonic 18650PF lithium battery are shown in Table 1.

Parameter identification based on FFRLS
The data derived from the UDDS operational conditions exhibit notable regularity. Notably, a 10 % reduction in battery capacity corresponds to a consistent pulse current value, as illustrated in Fig. 4(a), and falls within a defined voltage range, as shown in Fig. 4(b). The variance between consecutive pulse current measurements remains remarkably stable, ranging between 0.00081 and 0.00083. Drawing on these findings, this study employs the specified current and voltage parameters in conjunction with the FFRLS algorithm to facilitate the online identification of battery parameters.
According to Kirchhoff's voltage and current laws, the battery state equation based on the second-order RC equivalent circuit model can be obtained, as shown in Eq. (16).
The transfer function of the battery system can be obtained by applying the Laplace transform to Eq. (16), as shown in Eq. (17).
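Applying the bilinear transformation s = (2/T)(1 − z^−1)/(1 + z^−1), where T is the sampling period, maps the transfer function to the z-domain. Letting τ_S = R_S C_S and τ_P = R_P C_P denote the time constants of the two RC branches, a linearized equation of state can be obtained, as shown in Eq. (18) and Eq. (19).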
The linear equation of state is discretized, as shown in Eq. (20). Then the data matrix φ_k and parameter matrix θ_k of the battery system can be obtained, and the input-output equation of FFRLS can be established, as shown in Eq. (21) and Eq. (22).
Therefore, the input-output equation of FFRLS is shown in Eq. (23), and the calculation equation is shown in Eq. (2).
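For orientation, a form of the regression that is commonly used for second-order RC models in the literature is sketched below. It is offered as an assumption, since sign conventions and coefficient names may differ from Eqs. (20) to (23); here E(k) is the difference between open-circuit and terminal voltage, and k_1 to k_5 are hypothetical coefficient names.

```latex
\begin{aligned}
E(k) &= U_{OCV}(k) - U_t(k), \\
E(k) &= k_1 E(k-1) + k_2 E(k-2) + k_3 I(k) + k_4 I(k-1) + k_5 I(k-2), \\
\varphi_k &= \begin{bmatrix} E(k-1) & E(k-2) & I(k) & I(k-1) & I(k-2) \end{bmatrix}^{T}, \\
\theta_k &= \begin{bmatrix} k_1 & k_2 & k_3 & k_4 & k_5 \end{bmatrix}^{T}, \\
y_k &= \varphi_k^{T}\,\theta_k + e_k .
\end{aligned}
```

In this form the five coefficients are estimated recursively, and the circuit parameters (R0, the two resistances, and the two time constants) are then recovered by inverting the bilinear transformation.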
In summary, FFRLS was employed to identify the parameters. This method yields the values of the battery parameters at every time step, along with the changes in current and voltage. However, the resulting data volume after identification is considerable. To reduce the computational complexity of the Simulink simulation and the UKF prediction of SOC, it is necessary to establish the relationship between the parameters and SOC. In this study, parameter identification values were selected with SOC ranging from 0 to 100 % in increments of 10 %. After FFRLS parameter identification, the mathematical relationship between each parameter and SOC is given in Eq. (24) to Eq. (29). The identification values for each parameter are presented in Table 2.
An interesting phenomenon can be seen in Table 2. After discharge starts, the internal polarization resistances R1 and R2 both drop suddenly and then stabilize without sudden fluctuations. The capacitance C1 first increases after discharge starts. When the SOC reaches 50 %, C1 decreases significantly. When the SOC drops to 10 %, C1 rises again to its value before discharge. The polarization capacitance C2 increases to a stable value after discharge starts and declines continuously once the SOC reaches 50 %. Thus, at SOC = 50 %, most of the parameters change abruptly. That is to say, when the battery capacity reaches 50 %, the chemical reaction rate and effect inside the battery are affected to some extent.
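The fitting of a parameter against SOC at 10 % increments can be sketched in Python as below. The functional form of Eqs. (24) to (29) is not reproduced here, so a low-order polynomial is assumed purely for illustration, and the sample values are generated from an arbitrary curve rather than taken from Table 2.

```python
import numpy as np

# SOC grid from 0 % to 100 % in 10 % steps, expressed as a fraction
soc = np.linspace(0.0, 1.0, 11)

# Placeholder values generated from an arbitrary curve; NOT Table 2 data
r0_identified = 0.030 - 0.020 * soc + 0.015 * soc**2

# Hypothetical choice of a second-order polynomial for the fit
coeffs = np.polyfit(soc, r0_identified, deg=2)
r0_of_soc = np.poly1d(coeffs)

# The fitted function can then be evaluated at any SOC during simulation
r0_at_55 = r0_of_soc(0.55)
```

Replacing a long time series of identified values with one compact function per parameter is what keeps the later simulation and filtering stages cheap to evaluate.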

Simulink Simulation based on model
The operational condition of the battery is primarily indicated by changes in the terminal voltage [33,34]. Therefore, the working state of the battery is simulated using a battery system simulation model to verify the accuracy of parameter identification and the robustness of the second-order RC equivalent circuit model. The simulation is built on the existing mathematical relationship between each parameter and SOC, the full response law of the resistance-capacitance network, and the charge-discharge characteristics of the battery. Simulink is employed here to simulate battery charging and discharging characteristics, because it is simple to use and does not require modeling the chemical reactions in batteries. Therefore, a simulation model based on the second-order RC equivalent circuit is established in Simulink. Fig. 5 shows the overall architecture of the model. The SOC simulation module in the model is built according to the definition of SOC, as shown in Eq. (30).
In Eq. (30), SOC_0 is the initial battery SOC, η is the charging and discharging efficiency, and Q is the battery capacity. For lithium batteries, the charging and discharging efficiency is essentially 1. The input signal of the simulation prediction model is the UDDS test current, and a Gaussian band-limited white noise module is added to simulate the real UDDS test condition. The normally distributed random numbers generated by the Gaussian band-limited white noise module are applicable to continuous or mixed systems [35], and the battery system is a continuous nonlinear system, so adding noise represents the actual working state of the battery more faithfully. The actual capacity of the battery used in this paper is 2.7015 Ah, so the total charge of the battery is 9725.364 C. The initial capacity of the battery before the test, that is, the initial SOC, is 97.24 %. These values are used as the initial conditions of the battery system to simulate the changes in terminal voltage and SOC. Fig. 6(a) and (b) show the comparison between simulation and experimental results, and Fig. 7(a) and (b) show the error between them.
Fig. 5. Battery simulation model framework.
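The SOC definition used by the simulation module amounts to current integration (coulomb counting), which can be sketched in Python as follows. The sketch assumes a discharge-positive current convention and a simple rectangle-rule integral; the capacity and initial SOC are the values stated in the text, while the constant 1 A current profile is an illustrative placeholder.

```python
import numpy as np

def coulomb_count(current, dt, soc0, capacity_ah, eta=1.0):
    """SOC trajectory by current integration.

    Assumes discharge current is positive and samples are dt seconds apart.
    """
    q_total = capacity_ah * 3600.0        # convert Ah to coulombs
    discharged = np.cumsum(current) * dt  # rectangle-rule integral, in C
    return soc0 - eta * discharged / q_total

# Illustrative profile: 1 A discharge for one hour (not the UDDS current)
current = np.full(3600, 1.0)
soc = coulomb_count(current, dt=1.0, soc0=0.9724, capacity_ah=2.7015)
```

Because this integral has no feedback from the measured voltage, any current noise or capacity error accumulates over time, which motivates the later correction stage.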
As can be seen from Fig. 7(a) and (b), there are still some errors when Simulink is used to simulate the battery SOC and the working state of the battery. The initial simulation results are close to the experimental results, but as the battery capacity is gradually exhausted, the simulation error of SOC gradually increases, reaching up to 0.054. The simulation error of the terminal voltage fluctuates between 0 and 0.5 V, but when the battery capacity is about to be exhausted, the error can reach up to 0.8 V.
The experimental test results show that when the battery SOC drops to about 20 %, the SOC decreases gradually and slowly, and the frequency of decline gradually decreases. The average amplitude of the simulated SOC is broadly similar, but when the battery capacity is close to 20 %, there is a large difference between the simulation predictions and the experimental results. Generally, when measuring SOC, the voltage of the battery is not affected. However, if the SOC is less than 20 %, the characteristics of the battery change sharply because of electrolyte consumption and side reactions in the electrolytic reaction.
Therefore, when the battery capacity is low, the simulation model has a large error in predicting SOC and the battery working state. When the SOC prediction error is large, it also has a great impact on the charging and discharging state of the battery.

UKF improve accuracy of SOC prediction
From the analysis of the previous results, it is necessary to use the UKF to correct the predicted values of the model. To predict SOC with the UKF, the lithium battery system should be discretized to obtain the state equation and observation equation of the discretized system, which are presented in Eq. (31), in which x_k is the state vector representing the state of the battery system at time k, and y_k is the observation vector representing the observed output of the battery system at time k, such as the battery terminal voltage. u_k represents the input to the battery system, such as the battery current; w_k and v_k are independent process noise and observation noise, respectively. A_k and B_k are the state matrices of the battery system, and C_k and D_k are the observation matrices of the battery system. According to Eq. (30), the state space model of the battery system can be expressed in the form of a discrete-time equation, as shown in Eq. (32). In Eq. (32), U_{i,k} is the terminal voltage at time k of the two RC networks, and I_k is the circuit current at time k.
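A discrete-time state-space form that is commonly used for the second-order RC model is sketched below for reference. It is an assumption rather than a reproduction of Eq. (32): it takes the state as the SOC plus the two RC-branch voltages, uses a discharge-positive current, and writes T for the sampling period and τ_i = R_i C_i for the branch time constants.

```latex
\begin{aligned}
\begin{bmatrix} SOC_{k+1} \\ U_{1,k+1} \\ U_{2,k+1} \end{bmatrix}
&=
\begin{bmatrix}
1 & 0 & 0 \\
0 & e^{-T/\tau_1} & 0 \\
0 & 0 & e^{-T/\tau_2}
\end{bmatrix}
\begin{bmatrix} SOC_{k} \\ U_{1,k} \\ U_{2,k} \end{bmatrix}
+
\begin{bmatrix}
-\eta T / Q \\
R_1\left(1 - e^{-T/\tau_1}\right) \\
R_2\left(1 - e^{-T/\tau_2}\right)
\end{bmatrix} I_k + w_k, \\
U_{t,k} &= U_{OCV}(SOC_k) - U_{1,k} - U_{2,k} - R_0 I_k + v_k .
\end{aligned}
```

The observation equation is nonlinear through U_OCV(SOC), which is the reason a nonlinear filter is needed for the correction step.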
The predicted results of battery SOC obtained by the unscented Kalman filtering algorithm are shown in Fig. 8. The error between the UKF prediction results and the Simulink simulation prediction results is shown in Fig. 9(a), and the error between the UKF prediction results and the experimental test results is shown in Fig. 9(b).
The prediction accuracy of the battery operating state depends on the prediction accuracy of battery SOC. The error between the SOC simulation results of the battery equivalent circuit model established in Simulink and the experimental results fluctuates between −5 % and 5 %, but when the battery capacity is about to run out, the error can reach about 20 %.
However, when the unscented Kalman filter is used to predict the SOC of the battery, the error between the SOC results and the experimental results fluctuates between 0 % and 0.5 %, and the maximum error does not exceed 1 %. To elucidate the rationale for selecting the UKF for correction, a comparative analysis was conducted against other pertinent methods, with the findings detailed in Table 3, focusing on the correction of the model's SOC predictions. Fig. 10 graphically represents the outcomes of this comparison. Moreover, to quantitatively assess the accuracy of these methods, the Root Mean Squared Error (RMSE) associated with each technique is tabulated in Table 4. As mentioned before, since the battery is a nonlinear system, the standard Kalman filtering method is not compared here.
The results presented in Table 4 compare the RMSE values of the SOC obtained through UKF correction against both simulated predictions and experimental data. Notably, the UKF yields RMSE values of 0.0025 and 0.0186 when compared against the simulated and experimental signals, respectively. These values show that the UKF keeps the error within a narrow margin, affirming its superior accuracy in SOC tracking.
Moreover, the fidelity of the UKF is further exemplified in Fig. 10, which shows the tracking trajectory of the filter, marked by negligible perturbations and close adherence to the experimental curve. This behavior reflects the robust error covariance handling of the UKF, which allows it to mitigate the propagation of uncertainties throughout the estimation process.
In contrast, the Extended Kalman Filter (EKF), while demonstrating good precision, exhibits a slightly higher RMSE value, which, in the graphical representation, translates to slightly more pronounced oscillatory behavior around the ground truth values. Such fluctuations, although contained, indicate comparatively less precise convergence of the EKF to the SOC path.
Similarly, the PF, despite its capability to chart the overarching SOC trend, portrays the most significant departures from the test values, as evidenced in Fig. 10 by its broader error band.This variability, observable as the most pronounced waveforms around the true SOC trajectory, indicates a higher susceptibility to model and measurement noises, affecting its predictive consistency.
Therefore, the prediction of battery SOC by UKF has higher accuracy, which provides higher precision prediction results for battery management system, and provides safety guarantee for the smooth operation of electric vehicle.
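The RMSE metric used for the comparison in Table 4 can be computed with a few lines of Python. The function below follows the usual definition of root mean squared error; the three sample points are arbitrary placeholders and are not values from the paper.

```python
import numpy as np

def rmse(predicted, reference):
    """Root mean squared error between two equally long sequences."""
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean((predicted - reference) ** 2)))

# Arbitrary placeholder SOC values (fractions), not data from the paper
value = rmse([0.90, 0.80, 0.70], [0.91, 0.80, 0.69])
```

Because the errors are squared before averaging, occasional large deviations, such as those near the end of discharge, weigh more heavily than many small ones.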

Conclusion
This study deepens the understanding of the charge-discharge behavior of Lithium Iron Phosphate (LFP) batteries by introducing a three-stage methodology rooted in model-data fusion for the accurate estimation of the SOC of lithium-ion batteries, which is crucial for the operational stability of electric vehicles. By integrating the FFRLS method for parameter identification, a model-data fusion approach for simulation model construction, and corrections via the UKF, this methodology demonstrates superior precision and reliability. Comparative analyses underscore the efficacy of the UKF, which yields the lowest RMSE values of 0.0025 for simulations and 0.0186 for experimental data, while consistently maintaining errors below 5 % against actual data. These findings pave the way for future advancements in battery management systems.
Future research should address the intricate effects of temperature, operational conditions, and other environmental influences on battery performance.Efforts to extend the adaptability of this methodology to diverse scenarios will not only enrich its robustness and applicability but also pave the way for a more nuanced understanding of battery dynamics.Anticipated developments may include innovative adaptive strategies for energy management, tailored to meet the evolving technological demands and environmental considerations.

Fig. 9. Error curve: (a) The error between UKF prediction results and simulation prediction results; (b) The error between UKF prediction results and experimental test results.

Table 1
Battery parameters value.

Table 2
Battery parameter identification values.